Goals of Presentation


Highlight  major  design  trade-offs  when  comparing  an 
ASIC  and  FPGA  solution  for  pulse  compression 

Provide  information  to  help  choose  the  right  tool  for  the 
right  job 
Outline


Overview  of  pulse  compression 

Comparison  of  computational  approaches 

Trade-offs  when  mapping  algorithm  to  an  ASIC  or 
FPGA 

Example  analysis 
Other  considerations 
Summary 
Pulse Compression Overview


Convolves  return  signal  with  complex  conjugate  of  transmit 
waveform 

Produces  peak  where  correlation  occurs  [1] 

-  Indicates  location  of  target  in  range 

-  Compressed  pulse  narrower  than  width  of  transmit  waveform  (higher 
range  resolution) 

-  Helps  radar  obtain  good  ranging  accuracy  with  low  instantaneous 
transmitter  power 

Ability  to  produce  narrow  peaks  depends  upon  transmit  waveform’s 

-  Bandwidth 

-  Duration  (length) 

Bandwidth  •  duration  =  Time  Bandwidth  Product  (TBP) 

Higher  TBP  [2] 


-  Finer  range  resolution 

-  Lower  instantaneous  transmitting  power 

-  Requires  more  computational  horsepower 
Pulse Compression Illustration


[Figure: Received Signal (t) → Pulse Compression (convolution with complex conjugate of transmit waveform) → Compressed Received Signal (t)]


Two  targets  in  receive  window  hard  to  pinpoint  in  time 
(range) 

Targets  clearly  stand  out  after  compression 
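
The effect in the illustration can be reproduced with a small sketch. Everything specific here is an assumption made for illustration, not taken from the slides: a 64-sample linear FM pulse, a 256-sample receive window, and two equal-strength, noise-free targets 30 samples apart so that their echoes overlap. The compression is written as a correlation against the transmit waveform, which is the same operation as convolving with its time-reversed complex conjugate.

```python
import cmath

def lfm_chirp(n):
    """Unit-amplitude linear FM pulse of n samples sweeping the sampled band."""
    return [cmath.exp(1j * cmath.pi * (k - n / 2) ** 2 / n) for k in range(n)]

def compress(received, transmit):
    """Correlate the receive window against the transmit waveform."""
    n = len(transmit)
    conj_tx = [s.conjugate() for s in transmit]
    return [
        sum(received[lag + k] * conj_tx[k] for k in range(n))
        for lag in range(len(received) - n + 1)
    ]

# Hypothetical scenario: echoes start at samples 40 and 70 and overlap
# for 34 samples in the raw receive window.
pulse = lfm_chirp(64)
delays = [40, 70]
window = [0j] * 256
for d in delays:
    for k, s in enumerate(pulse):
        window[d + k] += s

magnitude = [abs(v) for v in compress(window, pulse)]
# Indices of the two largest compressed-output samples, in time order.
peaks = sorted(sorted(range(len(magnitude)), key=magnitude.__getitem__)[-2:])
print(peaks)
```

The two largest compressed samples fall at the target delays, even though the raw echoes overlap in time.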
Approaches to Digital Pulse Compression


Time  domain  convolution 

-  Filter  time  samples  of  receive  window  using  Finite  Impulse  Response 
(FIR)  filter 

-  Use  transmit  waveform  samples  as  tap  values  (number  of  taps  =  TBP) 

Frequency  domain  complex  multiplication 

-  FFT  (of  receive  window) 

-  Complex  multiplication  by  complex  conjugate  of  FFT  (transmit 
waveform) 

-  IFFT 

-  Overlap  by  TBP  if  sectioned  convolution* 

Both  approaches  mathematically  equivalent 

- Convolution (time) ⇔ multiplication (frequency)


*  For  DSP  implementation,  TBP  =  duration  •  sampling  rate 
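
The equivalence of the two approaches can be checked numerically. The sketch below is a minimal illustration, not a production FFT: it uses a textbook recursive radix-2 FFT, a hypothetical 8-sample transmit pulse, and an arbitrary 32-sample receive window. The frequency-domain path computes a circular correlation, so only the first (window length - pulse length + 1) outputs are free of wraparound and comparable with the time-domain result.

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k], out[k + n // 2] = even[k] + t, even[k] - t
    return out

def ifft(x):
    """Inverse FFT via the conjugation identity."""
    n = len(x)
    return [v.conjugate() / n for v in fft([u.conjugate() for u in x])]

def compress_time(received, transmit):
    """Time domain: FIR-style correlation against the transmit samples."""
    n = len(transmit)
    return [
        sum(received[lag + k] * transmit[k].conjugate() for k in range(n))
        for lag in range(len(received) - n + 1)
    ]

def compress_freq(received, transmit):
    """Frequency domain: FFT, multiply by conj(FFT(transmit)), IFFT."""
    size = len(received)  # assumed to be a power of two
    padded = list(transmit) + [0j] * (size - len(transmit))
    product = [r * t.conjugate() for r, t in zip(fft(received), fft(padded))]
    valid = size - len(transmit) + 1  # outputs without circular wraparound
    return ifft(product)[:valid]

# Arbitrary deterministic test data (illustrative only).
transmit = [cmath.exp(1j * cmath.pi * k * k / 8) for k in range(8)]
received = [complex(math.sin(0.3 * i), math.cos(0.7 * i)) for i in range(32)]

time_out = compress_time(received, transmit)
freq_out = compress_freq(received, transmit)
max_error = max(abs(a - b) for a, b in zip(time_out, freq_out))
print(max_error < 1e-9)
```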


Catalina Research Incorporated, A Paravant Company


Which  Approach  to  Use? 

•  Computational  efficiency  is  the  driving  factor 

•  Operations  defined  here  as  total  number  of  multiplies  and 
adds 

•  Number  of  FIR  operations  per  input  sample: 

= 8N - 2, where N = number of taps

• Number of FFT operations per input vector:

= 5 N log2(N), where N = FFT length
Both  equations  assume  complex  data 
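
One way to see where these two counts come from is to build them up from real multiplies and adds. The sketch below assumes the usual accounting (a complex multiply costs 4 real multiplies plus 2 real adds, a complex add costs 2 real adds, and a radix-2 butterfly uses one complex multiply and two complex adds); the function names are illustrative.

```python
REAL_OPS_PER_COMPLEX_MUL = 6  # 4 real multiplies + 2 real adds
REAL_OPS_PER_COMPLEX_ADD = 2  # 2 real adds

def fir_ops_per_sample(taps):
    """N complex multiplies plus N - 1 complex adds per output sample."""
    return (taps * REAL_OPS_PER_COMPLEX_MUL
            + (taps - 1) * REAL_OPS_PER_COMPLEX_ADD)

def fft_ops_per_vector(length):
    """(N/2) * log2(N) radix-2 butterflies, 10 real operations each."""
    stages = length.bit_length() - 1  # log2(length) for a power of two
    butterflies = (length // 2) * stages
    ops_per_butterfly = (REAL_OPS_PER_COMPLEX_MUL
                         + 2 * REAL_OPS_PER_COMPLEX_ADD)
    return butterflies * ops_per_butterfly

print(fir_ops_per_sample(256))  # 8 * 256 - 2
print(fft_ops_per_vector(512))  # 5 * 512 * 9
```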
Example: TBP = 256


FIR operations = 8 * 256 - 2 = 2046

→ 2046 operations need to happen every new input sample


FFT operations:

→ assume an FFT length of twice the TBP
5 * 512 * log2(512) = 23,040

→ this needs to happen twice (once for FFT, once for IFFT)*

= 2 * 23,040 = 46,080 operations
→ i.e. for every input vector, 46,080 operations need to occur
→ assuming sectioned convolution, overlap input vectors by TBP
→ thus, effective operations per input sample:

46,080 / (512 - 256) = 180 operations per new input sample


FFT  approach  is  over  11  times  as  efficient  as  FIR  in  this  case! 


*  Time  domain  window  can  be  folded  into  first  pass  of  FFT 
Complex  multiplication  can  be  folded  in  with  first  pass  of  IFFT 
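
The arithmetic of this example generalizes to other TBP values. The sketch below follows the same assumptions as the slide (FFT length of twice the TBP, input vectors overlapped by TBP, window and complex multiply folded into the FFT passes); the function names and the extra TBP values in the loop are illustrative.

```python
import math

def fir_ops_per_sample(tbp):
    """Time-domain cost: 8N - 2 with N = TBP taps."""
    return 8 * tbp - 2

def fft_ops_per_sample(tbp, fft_len):
    """FFT + IFFT cost spread over the new samples in each input vector."""
    per_vector = 2 * 5 * fft_len * math.log2(fft_len)  # FFT and IFFT
    new_samples = fft_len - tbp                        # overlap by TBP
    return per_vector / new_samples

for tbp in (64, 256, 1024):
    fir = fir_ops_per_sample(tbp)
    fft = fft_ops_per_sample(tbp, 2 * tbp)
    print(tbp, fir, fft, round(fir / fft, 1))
```

For TBP = 256 this gives 2046 and 180 operations per sample, a ratio of about 11.4, and the advantage of the FFT approach grows with TBP.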
Computational Efficiency of FFT vs. FIR


[Figure: Comparison of Pulse Compression Operations (FFT+CMUL+IFFT approach with 50% overlap)]
Mapping FFTs into Hardware

•  ASIC  or  FPGA? 

-  ASIC:  Pathfinder-2  programmable  frequency  domain  vector 
processor 

-  FPGA:  Xilinx  VirtexE 

•  Trade  space  considerations: 

-  Radar  system  parameters 

•  TBP 

•  Number  of  samples  in  the  receive  window 

-  Number  of  bits  (precision  and  dynamic  range) 

-  Performance  (measured  in  Pulse  Repetition  Frequency) 
Radar System Parameters


FFT size determined by (TBP + Ns - 1) [3]

- TBP = number of samples representing transmit pulse

- Ns = number of samples in receive window

= [Pw + 2(Rw/c)] • Fs

Pw = pulse width of transmit waveform
Rw = range window of the radar
c = speed of light
Fs = sampling rate of digital receiver system

Longer  FFTs  need  more 

-  Processing 

•  Larger  radix  cores 

•  More  passes  through  the  data 

-  Memory 

-  Bits 
Number of Bits


Today’s  high  speed  ADCs 

-  14  bits  up  to  100  MSPS 

-  12  bits  up  to  200  MSPS 

FFT  radix  computations  create  word  growth 

-  Radix  2  can  cause  growth  of  one  bit  just  due  to  additions 

-  Radix  4:  two  bits 

-  Radix  16:  four  bits 

Longer  FFT  lengths  require  more  radix  passes 

-  More  opportunity  for  growth 
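
The per-pass growth figures above amount to log2(radix) bits per pass, so the worst-case total depends on the FFT length rather than on the radix chosen. A minimal sketch of that bookkeeping follows; the function names are illustrative, and the 14-bit input width is taken from the ADC figures above.

```python
def radix_passes(fft_len, radix):
    """Number of passes needed when fft_len is an exact power of radix."""
    passes, size = 0, 1
    while size < fft_len:
        size *= radix
        passes += 1
    if size != fft_len:
        raise ValueError("fft_len must be an exact power of radix")
    return passes

def worst_case_growth_bits(fft_len, radix):
    """Growth from additions alone: log2(radix) bits per pass."""
    bits_per_pass = radix.bit_length() - 1  # 1, 2, 4 for radix 2, 4, 16
    return radix_passes(fft_len, radix) * bits_per_pass

adc_bits = 14
print(worst_case_growth_bits(1024, 2))   # ten radix-2 passes
print(worst_case_growth_bits(1024, 4))   # five radix-4 passes
print(worst_case_growth_bits(4096, 16))  # three radix-16 passes
print(adc_bits + worst_case_growth_bits(1024, 2))
```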
Floating Point vs. Fixed Point [4]


Floating  point 

-  Can  lead  to  truncation  or  rounding  errors  for  both  addition  and 
multiplication 

-  Overflows  highly  unlikely  due  to  very  large  dynamic  range 

-  Requires  more  hardware  resources  than  fixed  point  (adders  in 
particular) 

Fixed  point 

-  Truncation  or  rounding  errors  occur  only  for  multiplication 

-  Addition  can  lead  to  overflows 

•  Avoid  by  making  word  length  sufficiently  long  (may  not  be 
practical) 

•  Avoid  by  shifting  (scaling),  but  this  can  compromise  precision 
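
The two fixed-point failure modes can be shown with plain integers. This is a sketch under stated assumptions: a 16-bit two's-complement word that wraps on overflow (real hardware may saturate instead) and Q15 fractional multiplication that truncates the low 15 product bits. The helper names and operand values are illustrative.

```python
def wrap_signed(value, bits=16):
    """Model two's-complement wraparound of a result stored in `bits` bits."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def q15_mul(a, b):
    """Multiply two Q15 values, truncating the low 15 bits of the product."""
    return (a * b) >> 15

# Addition is exact while the sum fits in the word...
print(wrap_signed(1000 + 2000))
# ...but overflows once it exceeds the 16-bit signed range.
print(wrap_signed(30000 + 10000))

# Multiplication loses the low-order bits of the full-precision product.
product = q15_mul(12345, 23456)
lost = 12345 * 23456 - (product << 15)
print(product, lost)
```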
Performance: Pulse Repetition Frequency

•  Defines  how  often  the  radar  transmits  pulses 

•  Higher  PRFs  imply 

-  Faster  update  rates  and  track  loop  closure 

-  Lower  Doppler  ambiguity 

-  Higher  range  ambiguity 

•  Time  between  transmit  pulses  sets  a  limit  on  the 
processing  time  available 

•  Conversely,  the  processing  time  required  for  a  given  FFT 
size  limits  the  achievable  PRF 
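
These relationships can be put into numbers with two standard formulas: the time between pulses is 1/PRF, and the unambiguous range is c / (2 • PRF). The sketch below uses illustrative PRF values and a hypothetical 50 µsec processing time; none of these figures come from the slides.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s

def pulse_repetition_interval_s(prf_hz):
    """Time between transmit pulses, the upper bound on processing time."""
    return 1.0 / prf_hz

def unambiguous_range_m(prf_hz):
    """Largest range whose echo returns before the next pulse is sent."""
    return SPEED_OF_LIGHT / (2.0 * prf_hz)

def fits_in_interval(processing_time_s, prf_hz):
    """True if one pulse can be processed before the next one arrives."""
    return processing_time_s <= pulse_repetition_interval_s(prf_hz)

for prf in (5e3, 15e3, 25e3):
    print(prf, unambiguous_range_m(prf), fits_in_interval(50e-6, prf))
```

Raising the PRF shortens both the unambiguous range and the processing budget, which is the trade described in the bullets above.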
Example Analysis

• Assume the following radar system parameters:

Transmit Pulse Width             10.2 µsec
A/D Sampling Rate (Baseband)     10 MSPS
Range Window                     10 km
Calculate FFT Size


TBP = pulse width • sampling rate

- 10.2 µsec • 10 MSPS = 102 samples

Ns (number of samples in the receive window)

- [10.2 µsec + 2 (10 km / c)] • 10 MSPS = 769 samples

FFT size = 102 + 769 - 1 = 870 samples minimum
Round up to power of two: 1024 points

Well  within  capabilities  of  Pathfinder-2  or  FPGA 
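
The sizing steps above can be sketched as a short calculation. The function names are illustrative, and rounding the sample counts to the nearest integer is an assumption chosen because it reproduces the figures on this slide.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s

def transmit_samples(pulse_width_s, sample_rate_hz):
    """TBP for a DSP implementation: pulse duration times sampling rate."""
    return round(pulse_width_s * sample_rate_hz)

def receive_window_samples(pulse_width_s, range_window_m, sample_rate_hz):
    """Ns = [Pw + 2(Rw/c)] * Fs."""
    duration = pulse_width_s + 2.0 * range_window_m / SPEED_OF_LIGHT
    return round(duration * sample_rate_hz)

def fft_size(tbp, ns):
    """Minimum length TBP + Ns - 1 and the next power of two at or above it."""
    minimum = tbp + ns - 1
    return minimum, 1 << (minimum - 1).bit_length()

tbp = transmit_samples(10.2e-6, 10e6)
ns = receive_window_samples(10.2e-6, 10e3, 10e6)
print(tbp, ns, fft_size(tbp, ns))
```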
Define Word Length


Assume  14  bit  ADC 

Assume one bit growth per radix 2 stage (ten stages for 1K FFT)
Implies  word  length  of  24  bits  for  fixed  point  operations 

-  For  worst  case  input  to  FFT 

-  Assuming  rest  of  system  can  support  the  dynamic  range 

Fixed  point  implementation  must 

-  Define  sufficiently  large  word  (accumulator),  or 

-  Scale  data  input  to  each  radix  stage 

• Blindly shift at every iteration (Xilinx 1K FFT 16 bit core) [5]

•  Implement  “intelligent”  shifting  (e.g.  block  floating  point) 

Not  an  issue  for  floating  point  (Pathfinder-2) 
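
As a rough illustration of "intelligent" shifting, the sketch below scales a block only when its peak value would leave no headroom for one bit of growth in the next radix-2 stage, and tracks the accumulated shift as a shared block exponent. It is a simplified model of block floating point under those assumptions, not a description of how the Xilinx core or any other specific product implements scaling; the function name and sample values are hypothetical.

```python
def block_scale(block, word_bits, exponent):
    """Right-shift a block of integers so one more bit of growth still fits.

    Returns the scaled block and the updated shared exponent.
    """
    # Keep the peak below 2^(word_bits - 2) so that doubling it in the next
    # stage stays inside a signed word of word_bits bits.
    limit = 1 << (word_bits - 2)
    peak = max(abs(v) for v in block)
    shift = 0
    while (peak >> shift) >= limit:
        shift += 1
    return [v >> shift for v in block], exponent + shift

# A block with a large sample is shifted once; precision is reduced for
# every sample in the block, which is the cost noted in the slide.
print(block_scale([30000, -12000, 500], 16, 0))
# A block with small samples is left alone, unlike a blind per-stage shift.
print(block_scale([100, -50], 16, 3))
```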




Processing  Performance 


Algorithm: window → CFFT → CMUL → IFFT for 1K vector

Pathfinder-2

- 35.4 µsec at 133 MHz clock

- Achievable PRF = 1/35.4 µsec = 28.3 kHz assuming one channel

-  32  bit  IEEE  floating  point 

Xilinx  XCV2000E  sizing  estimate 

-  Assume  80  MHz  clock  rate 

- Achievable PRF (with 75% utilization) ≈ 15 kHz (one channel)

-  24  bit  fixed  point 

•  Overflow  still  a  concern 

• 24 bits would suffice for 1K FFT alone (most applications)

•  Does  not  provide  for  growth  due  to  IFFT 

•  Scaling  /  shifting  logic  will  still  be  needed 
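
The Pathfinder-2 figure follows directly from the processing time: the achievable PRF is the reciprocal of the time needed per pulse. The sketch below also converts the time to clock cycles. The `channels` parameter assumes channels are processed one after another on the same device, which is an illustrative assumption rather than something the slide states. Using the rounded 35.4 µsec value gives roughly 28.2 kHz, close to the quoted figure.

```python
def achievable_prf_hz(processing_time_s, channels=1):
    """PRF limit when each pulse must be processed before the next arrives."""
    return 1.0 / (processing_time_s * channels)

def clock_cycles(processing_time_s, clock_hz):
    """Approximate number of clock cycles spent on one vector."""
    return processing_time_s * clock_hz

prf = achievable_prf_hz(35.4e-6)
cycles = clock_cycles(35.4e-6, 133e6)
print(round(prf), round(cycles))
```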
Additional Design Considerations


Part  count 

-  Minimum  Pathfinder-2  solution  requires 

•  Pathfinder-2  ASIC 

•  Three  external  address  generators 

•  Three  SRAM  banks 

•  Small  FPGA  to  act  as  a  controller 

-  Entire  solution  could  fit  in  XCV2000E 

Parts  costs  (estimated) 

-  Pathfinder-2  solution  =  $1,500 

-  Xilinx  XCV2000E  =  $2,900 

Design  flexibility  and  development 

-  What  if  you  decide  to  change  FFT  sizes? 

-  What  if  you  want  to  match  against  multiple  transmit  waveforms? 




Summary 


Less  demanding  pulse  compression  application  good  match  for 
FPGAs 

More  demanding  system  requirements  quickly  drive  solution 
towards  a  Pathfinder-2  type  of  approach 


Pulse Compression Application (1K Vector Size)

Pathfinder-2 (ASIC)                               XCV2000E (FPGA)
Higher PRFs                                       Lower PRFs
Higher parts count                                Lower parts count
Less expensive                                    More expensive
Minimal precision and dynamic range concerns      Valid dynamic range and precision concerns
Easily scalable to more demanding algorithms      Not easily scalable to more demanding algorithms